pageserver: reorder upload queue when possible #10218

Draft
wants to merge 1 commit into
base: erik/assert-upload-index

erikgrinaker
Contributor

@erikgrinaker erikgrinaker commented Dec 20, 2024

Problem

The upload queue currently sees significant head-of-line blocking. For example, index uploads act as upload barriers, and for every layer flush we schedule a layer and index upload, which effectively serializes layer uploads.

Resolves #10096.
Requires #10228.

Summary of changes

Allow upload queue operations to bypass the queue if they don't conflict with preceding operations.
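To illustrate the idea, here is a minimal sketch of a queue that lets an operation skip ahead when it conflicts with nothing in front of it. It is not the code in this PR: `QueuedOp`, `touches`, `conflicts` and `next_ready` are hypothetical names, and "two operations conflict if they touch a common key" is a simplifying assumption that is weaker than the real ordering rules.

```rust
/// Hypothetical, simplified queue entry (not the pageserver's `UploadOp`).
#[derive(Debug, Clone, PartialEq)]
struct QueuedOp {
    name: &'static str,
    /// Keys (layer names, or "index") that this operation reads or writes.
    touches: Vec<&'static str>,
}

/// Assumption for this sketch: two operations conflict if they share a key.
fn conflicts(a: &QueuedOp, b: &QueuedOp) -> bool {
    a.touches.iter().any(|k| b.touches.contains(k))
}

/// Removes and returns the first queued operation that conflicts with
/// neither an in-progress operation nor any operation queued ahead of it.
fn next_ready(queued: &mut Vec<QueuedOp>, inprogress: &[QueuedOp]) -> Option<QueuedOp> {
    let pos = (0..queued.len()).find(|&i| {
        let op = &queued[i];
        inprogress.iter().all(|p| !conflicts(p, op))
            && queued[..i].iter().all(|q| !conflicts(q, op))
    })?;
    Some(queued.remove(pos))
}
```

With a queue of "upload A, index 1 (references A), upload B, index 2 (references A and B)", this sketch starts upload B while upload A is still in flight, instead of waiting behind index 1.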

upload_queue.num_inprogress_deletions == upload_queue.inprogress_tasks.len()
}
/// TODO: consider moving this and other associated logic into UploadOp and UploadQueue.
fn can_bypass(a: &UploadOp, b: &UploadOp) -> bool {
Collaborator

Let's unit test this against the most important invariants:

  • A layer upload must happen before an index that references it
  • A layer deletion must happen after an index that de-references it
  • If a layer name is re-used, the second upload must come after an index that de-references the earlier layer of the same name
  • ...whichever others we can think of
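One possible shape for such tests is an order checker: replay the operations in the order they would execute against a small model of remote storage, and fail on the first invariant violation. The sketch below is only an illustration under stated assumptions. `Op` is a hypothetical stand-in for `UploadOp`, layers are identified by name alone, and an index is modelled as the full set of layer names it references.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for the pageserver's `UploadOp`.
#[derive(Debug, Clone)]
enum Op {
    UploadLayer(&'static str),
    /// Uploads an index that references exactly these layer names.
    UploadIndex(Vec<&'static str>),
    DeleteLayer(&'static str),
}

/// Replays `ops` in execution order and returns the first invariant violation.
fn check_order(ops: &[Op]) -> Result<(), String> {
    let mut remote_layers: HashSet<&'static str> = HashSet::new();
    let mut index: HashSet<&'static str> = HashSet::new();
    for op in ops {
        match op {
            Op::UploadLayer(name) => {
                // A re-used name must first be de-referenced by an index.
                if index.contains(name) {
                    return Err(format!("upload of {name} overwrites a referenced layer"));
                }
                remote_layers.insert(*name);
            }
            Op::UploadIndex(layers) => {
                // Every referenced layer must already be uploaded.
                if let Some(missing) = layers.iter().find(|l| !remote_layers.contains(*l)) {
                    return Err(format!("index references missing layer {missing}"));
                }
                index = layers.iter().copied().collect();
            }
            Op::DeleteLayer(name) => {
                // A layer may only be deleted once the index no longer references it.
                if index.contains(name) {
                    return Err(format!("deletion of {name} while still referenced"));
                }
                remote_layers.remove(name);
            }
        }
    }
    Ok(())
}
```

A test could then feed every execution order that the reordering logic permits for a given queue through `check_order` and assert that none of them returns an error.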


github-actions bot commented Dec 20, 2024

7095 tests run: 6795 passed, 3 failed, 297 skipped (full report)


Failures on Postgres 17

# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_timeline_archival_chaos[release-pg17] or test_timeline_archival_chaos[release-pg17] or test_timeline_archival_chaos[release-pg17]"
Flaky tests (4)

Postgres 17

Test coverage report is not available

The comment gets automatically updated with the latest test results
c09b6a7 at 2024-12-30T15:07:57.923Z :recycle:

@skyzh skyzh self-requested a review December 20, 2024 22:56
@erikgrinaker erikgrinaker force-pushed the erik/upload-reorder branch 4 times, most recently from fdb6fbb to b032c18 Compare December 27, 2024 16:38
@erikgrinaker erikgrinaker changed the base branch from main to erik/assert-upload-index December 30, 2024 09:38
@erikgrinaker erikgrinaker force-pushed the erik/upload-reorder branch 2 times, most recently from 4eea68f to c09b6a7 Compare December 30, 2024 14:17
Development

Successfully merging this pull request may close these issues.

pageserver: improve flush upload queue parallelism
2 participants